Abstract

We  prove  the  soundness  of  a  polymorphic  type  system  for  a  language  with 
variables,  assignments,  and  first-class  functions.  As  a  corollary,  this  proves  the 
soundness  of  the  Edinburgh  LCF  ML  rules  for  typing  variables  and  assignments, 
thereby  settling  a  long-standing  open  problem. 

Keywords:  Type  theory,  formal  semantics,  variables  and  assignment. 


1  Introduction 

A  type  system  is  presented  for  a  language  with  a  letvar  construct  to  allocate  variables, 
which  are  implicitly  dereferenced  and  whose  addresses  are  not  first-class  values,  as  in 
traditional  imperative  languages.  Edinburgh  LCF  ML  [GMW78]  had  such  a  construct, 
which it called letref. We show that it is sound to require a variable to have weak
type only when it is assigned to inside a $\lambda$-abstraction within its scope. As a
corollary then, LCF ML restriction 2(i)(b) (pg. 49 [GMW78]), which requires a variable
to have a monotype (a type with no type variables) if the variable is assigned to inside
a $\lambda$-abstraction within its scope, is also sound since every monotype is weak. This
restriction  was  never  proved  sound,  according  to  Tofte  [Tof90]. 


2  The  Type  System 

The syntax of the language we consider is core ML with a letvar construct and
assignment. Following Tofte [Tof90], we distinguish a subset of the expressions called
Values: 

Appeared in Information Processing Letters, 56(3), November 1995, pp. 141-146. This material
is based upon activities supported by the National Science Foundation under Agreements No.
CCR-9400592 and CCR-9414421. Any opinions, findings, and conclusions or recommendations expressed in
this publication are those of the authors and do not necessarily reflect the views of the National Science
Foundation.
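\[
\begin{array}{llcl}
(\textit{Expressions}) & e & ::= & v \mid l \mid e_1\, e_2 \mid e_1 := e_2 \mid \\
 & & & \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 \mid \\
 & & & \mathtt{letvar}\ x := e_1\ \mathtt{in}\ e_2 \\[1ex]
(\textit{Values}) & v & ::= & x \mid \mathtt{unit} \mid \lambda x.\, e
\end{array}
\]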

Meta-variable $x$ ranges over identifiers. The letvar construct binds $x$ to a new cell
initialized to the value of $e_1$. The scope of the binding is $e_2$ and the lifetime of the cell
is unbounded. Dereferencing of variables created with letvar is implicit. Locations are
denoted by meta-variable $l$ and are not values.
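
As an illustration only, the grammar above can be transcribed into a small abstract syntax. The sketch below uses Python dataclasses; the class and field names (Var, Lam, Loc, and so on) and the helper is_value are hypothetical and not part of the paper. It records the one distinction the type rules rely on: locations are expressions but not values.

```python
from dataclasses import dataclass

# Hypothetical class names; the paper gives only the grammar.

@dataclass(frozen=True)
class Var:       # identifier x
    name: str

@dataclass(frozen=True)
class Unit:      # the constant unit
    pass

@dataclass(frozen=True)
class Lam:       # lambda-abstraction: lambda x. e
    param: str
    body: object

@dataclass(frozen=True)
class Loc:       # location l: an expression, but not a value
    addr: int

@dataclass(frozen=True)
class App:       # application: e1 e2
    fn: object
    arg: object

@dataclass(frozen=True)
class Assign:    # assignment: e1 := e2
    target: object
    source: object

@dataclass(frozen=True)
class Let:       # let x = e1 in e2
    name: str
    bound: object
    body: object

@dataclass(frozen=True)
class LetVar:    # letvar x := e1 in e2
    name: str
    init: object
    body: object

def is_value(e) -> bool:
    """Values are exactly identifiers, unit, and lambda-abstractions."""
    return isinstance(e, (Var, Unit, Lam))
```

For instance, is_value(Lam("x", Var("x"))) holds, while is_value(Loc(0)) does not.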

The  types  of  the  language  are  stratified  as  follows. 

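\[
\begin{array}{rcll}
\tau & ::= & \alpha \mid \mathit{unit} \mid \tau \rightarrow \tau' & (\textit{data types}) \\
\sigma & ::= & \forall\alpha.\,\sigma \mid \tau & (\textit{type schemes}) \\
\rho & ::= & \sigma \mid \tau\ \mathit{var} & (\textit{phrase types})
\end{array}
\]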

The meta-variable $\alpha$ ranges over type variables. Type variables are partitioned into weak
and strong type variables, written $\_\alpha$ and $\alpha$ respectively. These variables correspond to
the imperative and applicative type variables respectively of Tofte's system. We say
that a type scheme $\sigma$ is weak iff $\sigma$ is unquantified and every type variable in $\sigma$ is weak.
Type $\tau\ \mathit{var}$ is the type of locations storing values of type $\tau$.
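
The definition of a weak type scheme can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding in which each type variable carries a weak flag; the names TVar, TUnit, TArrow, Forall, free_type_vars and is_weak are illustrative and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:      # type variable; weak=True models a weak variable
    name: str
    weak: bool = False

@dataclass(frozen=True)
class TUnit:     # the data type unit
    pass

@dataclass(frozen=True)
class TArrow:    # function type: dom -> cod
    dom: object
    cod: object

@dataclass(frozen=True)
class Forall:    # type scheme: forall var . body
    var: TVar
    body: object

def free_type_vars(t) -> set:
    """Type variables occurring free in a data type or type scheme."""
    if isinstance(t, TVar):
        return {t}
    if isinstance(t, TArrow):
        return free_type_vars(t.dom) | free_type_vars(t.cod)
    if isinstance(t, Forall):
        return free_type_vars(t.body) - {t.var}
    return set()  # TUnit has no type variables

def is_weak(scheme) -> bool:
    """Weak iff unquantified and every type variable in it is weak."""
    if isinstance(scheme, Forall):
        return False
    return all(v.weak for v in free_type_vars(scheme))
```

Under this encoding a monotype such as TArrow(TUnit(), TUnit()) is weak vacuously, which is the observation behind the LCF ML corollary.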

The rules of the type system are formulated as they are in Harper's system [Har94]
and are given in Figure 1. It is a deductive proof system used to assign types to
expressions. Typing judgements have the form

\[ \lambda;\gamma \vdash e : \rho \]

meaning that expression $e$ has type $\rho$ assuming that $\lambda$ prescribes type schemes for
locations in $e$ and $\gamma$ prescribes phrase types for the free identifiers of $e$. Meta-variable
$\gamma$ ranges over identifier typings. An identifier typing $\gamma$ is a finite function mapping
identifiers to phrase types; $\gamma(x)$ is the phrase type assigned to $x$ by $\gamma$, and $\gamma[x : \rho]$
assigns phrase type $\rho$ to $x$ and phrase type $\gamma(x')$ to any identifier $x' \neq x$.

Meta-variable $\lambda$ ranges over location typings. Unlike other approaches [Tof90, Har94,
SmVo95],  a  location  typing  here  is  a  finite  function  mapping  locations  to  type  schemes. 
This  is  the  most  novel  aspect  of  the  type  system.  The  notational  conventions  for  location 
typings  are  similar  to  those  for  identifier  typings. 

The generalization of a type scheme $\sigma$ relative to $\lambda$ and $\gamma$, written $\mathit{Close}_{\lambda;\gamma}(\sigma)$, is
the type scheme $\forall\bar{\alpha}.\,\sigma$, where $\bar{\alpha}$ is the set of all type variables occurring free in $\sigma$ but
not in $\lambda$ or in $\gamma$. We write $\lambda \vdash e : \tau$ and $\mathit{Close}_{\lambda}(\sigma)$ when $\gamma = \emptyset$. A restricted form of
generalization, written $\mathit{AppClose}_{\lambda;\gamma}(\sigma)$, is defined to be the same as $\mathit{Close}_{\lambda;\gamma}(\sigma)$ except
that only strong type variables are generalized; any weak ones remain free. As in Tofte
[Tof90], the generic instance relation ($\geq$) of Damas and Milner [DaM82] is restricted by
requiring universally quantified weak type variables to be instantiated only with weak
types.

Finally, we write $\lambda;\gamma \vdash e : \sigma$ iff $\lambda;\gamma \vdash e : \tau$ whenever $\sigma \geq \tau$.
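
The difference between the two forms of generalization can be seen in a few lines of code. This is a sketch under simplifying assumptions: data types are nested tuples, a type variable is ("var", name, is_weak), a type scheme is a pair (quantified variables, body), generalization is applied to an unquantified type only, and the free type variables of the two typings are passed in as one precomputed set env_vars. The function names are hypothetical.

```python
def ftv(t):
    """Free type variables of a data type built from nested tuples."""
    if t[0] == "var":
        return {t}
    if t[0] == "arrow":
        return ftv(t[1]) | ftv(t[2])
    return set()  # ("unit",)

def close(tau, env_vars):
    """Close: quantify every variable free in tau but not in the typings."""
    return (frozenset(ftv(tau) - env_vars), tau)

def app_close(tau, env_vars):
    """AppClose: like close, but weak variables are left free."""
    strong = {v for v in ftv(tau) - env_vars if not v[2]}
    return (frozenset(strong), tau)

# A strong variable a and a weak variable w in the type a -> w.
a = ("var", "a", False)
w = ("var", "w", True)
tau = ("arrow", a, w)

both = close(tau, set())             # quantifies a and w
strong_only = app_close(tau, set())  # quantifies a only
```

If a already occurs free in a typing, as in close(tau, {a}), it is not generalized by either operation.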


3  Semantics  and  Soundness 

In  this  section,  we  establish  type  soundness  using  the  framework  of  Harper  [Har94].  First 
we  give  a  structured  operational  semantics  for  the  language.  An  expression  is  evaluated 
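\[
\begin{array}{ll}
\text{(var)} & \lambda;\gamma \vdash x : \tau\ \mathit{var} \qquad \gamma(x) = \tau\ \mathit{var} \\[2ex]
\text{(ident)} & \lambda;\gamma \vdash x : \tau \qquad \gamma(x) \geq \tau \\[2ex]
\text{(varloc)} & \lambda;\gamma \vdash l : \tau\ \mathit{var} \qquad \lambda(l) \geq \tau \\[2ex]
\text{(unit)} & \lambda;\gamma \vdash \mathtt{unit} : \mathit{unit} \\[2ex]
\text{($\rightarrow$-intro)} & \dfrac{\lambda;\gamma[x : \tau_1] \vdash e : \tau_2}
  {\lambda;\gamma \vdash \lambda x.\, e : \tau_1 \rightarrow \tau_2} \\[3ex]
\text{($\rightarrow$-elim)} & \dfrac{\lambda;\gamma \vdash e_1 : \tau_1 \rightarrow \tau_2 \qquad \lambda;\gamma \vdash e_2 : \tau_1}
  {\lambda;\gamma \vdash e_1\, e_2 : \tau_2} \\[3ex]
\text{(let-val)} & \dfrac{\lambda;\gamma \vdash v : \tau_1 \qquad \lambda;\gamma[x : \mathit{Close}_{\lambda;\gamma}(\tau_1)] \vdash e : \tau_2}
  {\lambda;\gamma \vdash \mathtt{let}\ x = v\ \mathtt{in}\ e : \tau_2} \\[3ex]
\text{(let-ord)} & \dfrac{\lambda;\gamma \vdash e_1 : \tau_1 \qquad \lambda;\gamma[x : \mathit{AppClose}_{\lambda;\gamma}(\tau_1)] \vdash e_2 : \tau_2}
  {\lambda;\gamma \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 : \tau_2} \\[3ex]
\text{(letvar)} & \dfrac{\begin{array}{c}
  \lambda;\gamma \vdash e_1 : \tau_1 \qquad \lambda;\gamma[x : \tau_1\ \mathit{var}] \vdash e_2 : \tau_2 \\
  \text{If $x$ is assigned to in a $\lambda$-abstraction in $e_2$ then $\tau_1$ is weak.}
  \end{array}}
  {\lambda;\gamma \vdash \mathtt{letvar}\ x := e_1\ \mathtt{in}\ e_2 : \tau_2} \\[4ex]
\text{(r-val)} & \dfrac{\lambda;\gamma \vdash e : \tau\ \mathit{var}}
  {\lambda;\gamma \vdash e : \tau} \\[3ex]
\text{(assign)} & \dfrac{\lambda;\gamma \vdash e_1 : \tau\ \mathit{var} \qquad \lambda;\gamma \vdash e_2 : \tau}
  {\lambda;\gamma \vdash e_1 := e_2 : \mathit{unit}}
\end{array}
\]

Figure 1: Rules of the Type System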
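
The side condition of rule (letvar) is purely syntactic. One way to read it is sketched below, assuming a hypothetical tuple encoding of expressions, such as ("lam", x, e), ("assign", e1, e2) and ("letvar", x, e1, e2), and taking "assigned to" to mean "occurs as the left-hand side of :=". The function name and the treatment of shadowing are this sketch's assumptions, not definitions from the paper.

```python
def assigned_in_lambda(x, e, under_lambda=False):
    """True if identifier x is the target of := inside a lambda in e."""
    tag = e[0]
    if tag in ("var", "unit", "loc"):
        return False
    if tag == "lam":
        _, y, body = e
        # A parameter named x shadows the x we are tracking.
        return y != x and assigned_in_lambda(x, body, True)
    if tag == "app":
        return any(assigned_in_lambda(x, s, under_lambda) for s in e[1:])
    if tag == "assign":
        _, target, source = e
        hit = under_lambda and target == ("var", x)
        return (hit
                or assigned_in_lambda(x, target, under_lambda)
                or assigned_in_lambda(x, source, under_lambda))
    # let / letvar: e1 lies outside the new binding's scope, e2 inside it.
    _, y, e1, e2 = e
    if assigned_in_lambda(x, e1, under_lambda):
        return True
    return y != x and assigned_in_lambda(x, e2, under_lambda)

# x := unit at top level: no weakness requirement on the type of x.
direct = ("assign", ("var", "x"), ("unit",))
# lambda y. x := y: the type of x must be weak.
in_lam = ("lam", "y", ("assign", ("var", "x"), ("var", "y")))
```

A type checker following Figure 1 would call such a predicate on $e_2$ and require $\tau_1$ to be weak only when it returns True.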
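\[
\begin{array}{ll}
\text{(val)} & \mu \vdash v \Rightarrow v,\ \mu \\[2ex]
\text{(deref)} & \mu \vdash l \Rightarrow \mu(l),\ \mu \\[2ex]
\text{(apply)} & \dfrac{\begin{array}{c}
  \mu \vdash e_1 \Rightarrow \lambda x.\, e_1',\ \mu_1 \\
  \mu_1 \vdash e_2 \Rightarrow v_2,\ \mu_2 \\
  \mu_2 \vdash [v_2/x]e_1' \Rightarrow v,\ \mu'
  \end{array}}
  {\mu \vdash e_1\, e_2 \Rightarrow v,\ \mu'} \\[5ex]
\text{(update)} & \dfrac{\mu \vdash e \Rightarrow v,\ \mu'}
  {\mu \vdash l := e \Rightarrow \mathtt{unit},\ \mu'[l := v]} \\[3ex]
\text{(bind)} & \dfrac{\begin{array}{c}
  \mu \vdash e_1 \Rightarrow v_1,\ \mu_1 \\
  \mu_1 \vdash [v_1/x]e_2 \Rightarrow v_2,\ \mu_2
  \end{array}}
  {\mu \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 \Rightarrow v_2,\ \mu_2} \\[4ex]
\text{(bindvar)} & \dfrac{\begin{array}{c}
  \mu \vdash e_1 \Rightarrow v_1,\ \mu_1 \\
  l \notin \mathit{dom}(\mu_1) \\
  \mu_1[l := v_1] \vdash [l/x]e_2 \Rightarrow v_2,\ \mu_2
  \end{array}}
  {\mu \vdash \mathtt{letvar}\ x := e_1\ \mathtt{in}\ e_2 \Rightarrow v_2,\ \mu_2}
\end{array}
\]

Figure 2: The Evaluation Rules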


relative to a memory $\mu$, which is a finite function from locations to values. The contents
of a location $l \in \mathit{dom}(\mu)$ is the value $\mu(l)$, and we write $\mu[l := v]$ for the memory that
assigns value $v$ to location $l$, and value $\mu(l')$ to a location $l' \neq l$. Note that $\mu[l := v]$ is
an update of $\mu$ if $l \in \mathit{dom}(\mu)$ and an extension of $\mu$ if $l \notin \mathit{dom}(\mu)$. The range of $\mu$ is the
set of all values $\mu(l)$, for $l \in \mathit{dom}(\mu)$.

Our evaluation rules are given in Figure 2. They allow us to derive judgements of
the form

\[ \mu \vdash e \Rightarrow v,\ \mu' \]

which is intended to assert that evaluating closed expression $e$ in memory $\mu$ results in
value $v$ and new memory $\mu'$. We write $[e'/x]e$ to denote the capture-avoiding substitution
of $e'$ for all free occurrences of $x$ in $e$. The use of substitution in the rules allows us to
avoid environments and closures in the semantics, so that the result of evaluating an
expression is just another expression.
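
The evaluation rules translate almost line for line into a substitution-based interpreter. The sketch below is illustrative only: it assumes a hypothetical tuple encoding of expressions, represents a memory as a Python dict from integer locations to values, and picks the least unused integer as the fresh location in (bindvar). Because only closed values and locations are ever substituted when running a closed program, the simple subst here cannot capture variables; it is not a general capture-avoiding substitution.

```python
import itertools

def subst(new, x, e):
    """[new/x]e, assuming `new` is closed (so no capture can occur)."""
    tag = e[0]
    if tag == "var":
        return new if e[1] == x else e
    if tag in ("unit", "loc"):
        return e
    if tag == "lam":
        return e if e[1] == x else ("lam", e[1], subst(new, x, e[2]))
    if tag in ("app", "assign"):
        return (tag, subst(new, x, e[1]), subst(new, x, e[2]))
    # let / letvar: the bound name scopes over e2 only
    _, y, e1, e2 = e
    return (tag, y, subst(new, x, e1), e2 if y == x else subst(new, x, e2))

def evaluate(mu, e):
    """Return (v, mu2) such that mu |- e => v, mu2."""
    tag = e[0]
    if tag in ("unit", "lam"):                       # (val)
        return e, mu
    if tag == "loc":                                 # (deref)
        return mu[e[1]], mu
    if tag == "app":                                 # (apply)
        f, mu1 = evaluate(mu, e[1])
        if f[0] != "lam":
            raise ValueError("operator is not a lambda-abstraction")
        v2, mu2 = evaluate(mu1, e[2])
        return evaluate(mu2, subst(v2, f[1], f[2]))
    if tag == "assign":                              # (update)
        if e[1][0] != "loc":
            raise ValueError("assignment target is not a location")
        v, mu1 = evaluate(mu, e[2])
        return ("unit",), {**mu1, e[1][1]: v}
    if tag == "let":                                 # (bind)
        v1, mu1 = evaluate(mu, e[2])
        return evaluate(mu1, subst(v1, e[1], e[3]))
    if tag == "letvar":                              # (bindvar)
        v1, mu1 = evaluate(mu, e[2])
        l = next(n for n in itertools.count() if n not in mu1)
        return evaluate({**mu1, l: v1}, subst(("loc", l), e[1], e[3]))
    raise ValueError(f"no rule applies to {e!r}")

# letvar x := unit in let d = (x := lambda y. y) in x
ident = ("lam", "y", ("var", "y"))
prog = ("letvar", "x", ("unit",),
        ("let", "d", ("assign", ("var", "x"), ident), ("var", "x")))
value, memory = evaluate({}, prog)
```

Here the final x is replaced by the allocated location and then implicitly dereferenced by (deref), so value is the identity abstraction stored by the assignment.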

The basic idea behind showing soundness is to show that if $\vdash e : \tau$ and $\vdash e \Rightarrow v, \mu'$,
then $\vdash v : \tau$, a property called subject reduction. But since $e$ can allocate locations and
since these locations can occur in $v$, the conclusion must actually be that there exists a
location typing $\lambda'$ such that $\lambda' \vdash v : \tau$ and such that $\mu' : \lambda'$. The latter condition asserts
that $\lambda'$ is consistent with $\mu'$. More precisely, we say that $\mu : \lambda$ iff $\mathit{dom}(\mu) = \mathit{dom}(\lambda)$ and
for every $l \in \mathit{dom}(\mu)$, $\lambda \vdash \mu(l) : \lambda(l)$.

It is the location typing $\lambda'$ that makes soundness delicate. As observed by Tofte
[Tof90], we may generalize a type variable $\alpha$ in typing $\vdash e : \tau$, only to find that $\alpha$ occurs
free in $\lambda'$, and therefore cannot be generalized in typing $\lambda' \vdash v : \tau$. For example, we can
define list reversal as follows:


letvar r := λx. x in
    r := λx. if x = [] then [] else (r (tl x)) @ [hd x];
    r
end

This expression has type $\forall\alpha.\ \alpha\ \mathit{list} \rightarrow \alpha\ \mathit{list}$ in our type system. But when the expression
is evaluated, a location $l$ of type $\alpha\ \mathit{list} \rightarrow \alpha\ \mathit{list}$ is allocated for r and $l$ appears in the
resulting value as well as in the domain of the resulting location typing $\lambda'$.

The solution proposed here is to use the quantified type $\forall\alpha.\ \alpha\ \mathit{list} \rightarrow \alpha\ \mathit{list}$ for $l$ in
$\lambda'$, thereby eliminating the free occurrence of $\alpha$. Of course, it is not always reasonable to
give a location a quantified type. For example, if $\lambda(l) = \forall\alpha.\ \alpha \rightarrow \alpha$, then the program
$l := \mathtt{not};\ l$ can be given type $\mathit{int} \rightarrow \mathit{int}$, yet it evaluates to not of type $\mathit{bool} \rightarrow \mathit{bool}$. Our
subject reduction theorem allows only read-only locations to be given quantified types.
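
To make this failure concrete, the following sketch spells out the typing steps for that program when $\lambda(l) = \forall\alpha.\ \alpha \rightarrow \alpha$. It assumes base types int and bool, a constant not of type $\mathit{bool} \rightarrow \mathit{bool}$, and sequencing $e_1; e_2$ typed by the type of $e_2$; none of these belong to the formal language of Section 2, so this is an informal illustration rather than a derivation in the system.

```latex
\begin{align*}
\lambda(l) \geq \mathit{bool} \rightarrow \mathit{bool}
  &\;\Longrightarrow\; \lambda \vdash l : (\mathit{bool} \rightarrow \mathit{bool})\ \mathit{var}
  && \text{(varloc)} \\
\lambda \vdash \mathtt{not} : \mathit{bool} \rightarrow \mathit{bool}
  &\;\Longrightarrow\; \lambda \vdash l := \mathtt{not} : \mathit{unit}
  && \text{(assign)} \\
\lambda(l) \geq \mathit{int} \rightarrow \mathit{int}
  &\;\Longrightarrow\; \lambda \vdash l : (\mathit{int} \rightarrow \mathit{int})\ \mathit{var}
  && \text{(varloc)} \\
  &\;\Longrightarrow\; \lambda \vdash l : \mathit{int} \rightarrow \mathit{int}
  && \text{(r-val)}
\end{align*}
```

The two uses of (varloc) pick different generic instances of the same quantified scheme, which is harmless for a location that is only read but unsound once the location is also written.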
We  now  turn  to  the  soundness  proof.  First  we  introduce  the  relevant  lemmas. 

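Lemma 3.1 (Superfluousness) Suppose that $\lambda;\gamma \vdash e : \tau$. If $l \notin \mathit{dom}(\lambda)$, then
$\lambda[l : \sigma];\gamma \vdash e : \tau$, and if $x \notin \mathit{dom}(\gamma)$, then $\lambda;\gamma[x : \sigma] \vdash e : \tau$.

Lemma 3.2 (Substitution) If $\lambda;\gamma \vdash v : \sigma$ and $\lambda;\gamma[x : \sigma] \vdash e : \tau$, then $\lambda;\gamma \vdash [v/x]e : \tau$.
Also, if $\lambda;\gamma \vdash l : \tau\ \mathit{var}$ and $\lambda;\gamma[x : \tau\ \mathit{var}] \vdash e : \tau'$, then $\lambda;\gamma \vdash [l/x]e : \tau'$.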

The  preceding  two  lemmas  are  straightforward  variants  of  the  lemmas  given  in  [Har94]. 
We  also  need  two  new  lemmas: 

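Lemma 3.3 (Strengthening) If $\lambda[l : \sigma_1] \vdash e : \sigma$ and $\sigma_2 \geq \sigma_1$, then $\lambda[l : \sigma_2] \vdash e : \sigma$.

Lemma 3.4 ($\forall$-intro) If $\lambda \vdash e : \sigma$ and $\alpha$ does not occur free in $\lambda$, then $\lambda \vdash e : \forall\alpha.\,\sigma$.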

Finally, we note that in spite of (r-val) our typing rules are essentially "syntax directed":

Lemma 3.5 ((r-val)-scope) If the derivation of $\lambda;\gamma \vdash e : \tau$ ends with (r-val), then
$e$ is an identifier or a location.

Proof. If the derivation ends with (r-val), then there must be a derivation of the
hypothesis $\lambda;\gamma \vdash e : \tau\ \mathit{var}$. But to show that an expression has a type of the form $\tau\ \mathit{var}$,
there are only two possible rules that can be used: (var) and (varloc). (The other
rules all give data types to expressions.) So $e$ must either be an identifier, in the case of
(var), or a location, in the case of (varloc). □

We  now  give  the  soundness  theorem: 

Theorem 3.6 (Subject Reduction) Suppose

(a) $\mu \vdash e \Rightarrow v, \mu'$,

(b) $\lambda \vdash e : \tau$,

(c) $\mu : \lambda$, and

(d) if a location $l$ is assigned to in $e$, then $\lambda(l)$ is unquantified; also, if $l$ is assigned to
in the range of $\mu$ or in a $\lambda$-abstraction in $e$, then $\lambda(l)$ is weak.

Then there exists $\lambda'$ such that

(e) $\lambda \subseteq \lambda'$,

(f) $\mu' : \lambda'$,

(g) $\lambda' \vdash v : \tau$,

(h) any strong type variable free in $\lambda'$ is free in $\lambda$, and

(i) if a location $l$ is assigned to in $v$ or in the range of $\mu'$, then $\lambda'(l)$ is weak.

Proof. The proof is by induction on the structure of the derivation of $\mu \vdash e \Rightarrow v, \mu'$.
Due to space limitations, we present only the most interesting cases, (update) and
(bindvar). We remark that property (h) above makes the (bind) case routine.

(update). The evaluation must end with

\[ \dfrac{\mu \vdash e \Rightarrow v,\ \mu'}{\mu \vdash l := e \Rightarrow \mathtt{unit},\ \mu'[l := v]} \]

and, by Lemma 3.5, the typing must end with

\[ \dfrac{\lambda \vdash l : \tau\ \mathit{var} \qquad \lambda \vdash e : \tau}{\lambda \vdash l := e : \mathit{unit}} \]

Also, $\mu : \lambda$, $\lambda(l)$ is unquantified, and if a location $l'$ is assigned to in $e$, then $\lambda(l')$ is
unquantified. And if $l'$ is assigned to in the range of $\mu$ or in a $\lambda$-abstraction in $e$, then
$\lambda(l')$ is weak. By induction, there exists $\lambda'$ such that

(e) $\lambda \subseteq \lambda'$,

(f) $\mu' : \lambda'$,

(g) $\lambda' \vdash v : \tau$,

(h) any strong type variable free in $\lambda'$ is free in $\lambda$, and

(i) if a location $l'$ is assigned to in $v$ or in the range of $\mu'$, then $\lambda'(l')$ is weak.

Now we must show

(f) $\mu'[l := v] : \lambda'$,

(g) $\lambda' \vdash \mathtt{unit} : \mathit{unit}$,

(i) if a location $l'$ is assigned to in unit or in the range of $\mu'[l := v]$,
then $\lambda'(l')$ is weak.

(g) follows immediately from typing rule (unit). (i) follows by induction, since if a
location $l'$ is assigned to in the range of $\mu'[l := v]$ then it is assigned to in $v$ or in the
range of $\mu'$. Finally, we consider (f), the most interesting case. For every $l' \in \mathit{dom}(\mu')$
with $l' \neq l$, we have

\[ \lambda' \vdash \mu'[l := v](l') : \lambda'(l') \]

by induction. Since $\lambda \vdash l : \tau\ \mathit{var}$, $\lambda(l) \geq \tau$. But since $\lambda(l)$ is unquantified, $\lambda(l) = \tau$ and
therefore $\lambda'(l) = \tau$ since $\lambda \subseteq \lambda'$. Since, by induction, $\lambda' \vdash v : \tau$, we have

\[ \lambda' \vdash \mu'[l := v](l) : \lambda'(l) \]

Thus we have $\mu'[l := v] : \lambda'$. This completes (update).
Notice the role of condition (d) in proving $\lambda' \vdash \mu'[l := v](l) : \lambda'(l)$ above. Since $l$ is
assigned to in $l := e$, $\lambda(l)$ must be unquantified and consequently has only one generic
instance, namely $\tau$. Therefore, $\lambda' \vdash \mu'[l := v](l) : \lambda'(l)$ follows directly from $\lambda' \vdash v : \tau$ of
the induction. If $\lambda(l)$ were quantified, then it would not be possible to show $\lambda' \vdash v : \lambda'(l)$.
For example, if $\lambda(l) = \forall\alpha.\ \alpha \rightarrow \alpha$, then on the program $l := \mathtt{not}$ we would have to show
that not has type $\forall\alpha.\ \alpha \rightarrow \alpha$.

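(bindvar). The evaluation must end with

\[
\dfrac{\begin{array}{c}
  \mu \vdash e_1 \Rightarrow v_1,\ \mu_1 \\
  l \notin \mathit{dom}(\mu_1) \\
  \mu_1[l := v_1] \vdash [l/x]e_2 \Rightarrow v_2,\ \mu_2
  \end{array}}
  {\mu \vdash \mathtt{letvar}\ x := e_1\ \mathtt{in}\ e_2 \Rightarrow v_2,\ \mu_2}
\]

and, by Lemma 3.5, the typing must end with

\[
\dfrac{\begin{array}{c}
  \lambda \vdash e_1 : \tau_1 \\
  \lambda; [x : \tau_1\ \mathit{var}] \vdash e_2 : \tau_2 \\
  \text{If $x$ is assigned to in a $\lambda$-abstraction in $e_2$ then $\tau_1$ is weak.}
  \end{array}}
  {\lambda \vdash \mathtt{letvar}\ x := e_1\ \mathtt{in}\ e_2 : \tau_2}
\]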

Also, $\mu : \lambda$ and if a location $l'$ is assigned to in $e_1$ or in $e_2$, then $\lambda(l')$ is unquantified.
And if $l'$ is assigned to in the range of $\mu$ or in a $\lambda$-abstraction in $e_1$ or in $e_2$, then $\lambda(l')$
is weak. By induction, there exists $\lambda_1$ such that

(e) $\lambda \subseteq \lambda_1$,

(f) $\mu_1 : \lambda_1$,

(g) $\lambda_1 \vdash v_1 : \tau_1$,

(h) any strong type variable free in $\lambda_1$ is free in $\lambda$, and

(i) if a location $l'$ is assigned to in $v_1$ or in the range of $\mu_1$, then $\lambda_1(l')$ is weak.

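Since $l \notin \mathit{dom}(\lambda_1)$, $\lambda_1 \subseteq \lambda_1[l : \tau_1]$. Now, since $\lambda_1[l : \tau_1] \vdash l : \tau_1\ \mathit{var}$ and, by Lemma 3.1,
$\lambda_1[l : \tau_1]; [x : \tau_1\ \mathit{var}] \vdash e_2 : \tau_2$, we can apply Lemma 3.2 to get

(b) $\lambda_1[l : \tau_1] \vdash [l/x]e_2 : \tau_2$

We also have, by Lemma 3.1,

(c) $\mu_1[l := v_1] : \lambda_1[l : \tau_1]$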

Next, if a location $l'$ is assigned to in $[l/x]e_2$, then either $l'$ is assigned to in $e_2$ or $l' = l$.
In the first case we have that $\lambda(l')$ is unquantified by hypothesis, and so $\lambda_1[l : \tau_1](l')$ is
unquantified. In the second case we have $\lambda_1[l : \tau_1](l) = \tau_1$, which is unquantified. Also,
if $l'$ is assigned to in the range of $\mu_1[l := v_1]$, then $l'$ is assigned to in $v_1$ or in the range of
$\mu_1$, so by induction $\lambda_1(l')$ is weak, and hence $\lambda_1[l : \tau_1](l')$ is weak, since $\lambda_1 \subseteq \lambda_1[l : \tau_1]$.
Finally, if $l'$ is assigned to in a $\lambda$-abstraction in $[l/x]e_2$, then either $l'$ is assigned to in
a $\lambda$-abstraction in $e_2$, or $l' = l$ and $x$ is assigned to in a $\lambda$-abstraction in $e_2$. In the first
case, $\lambda(l')$ is weak by hypothesis, and so $\lambda_1[l : \tau_1](l')$ is weak. In the second case, we
have $\tau_1$ is weak by the restriction on the (letvar) rule, and so $\lambda_1[l : \tau_1](l')$ is weak.
Therefore, we have

(d) if a location $l'$ is assigned to in $[l/x]e_2$, then $\lambda_1[l : \tau_1](l')$ is unquantified; also,
if $l'$ is assigned to in the range of $\mu_1[l := v_1]$ or in a $\lambda$-abstraction in $[l/x]e_2$,
then $\lambda_1[l : \tau_1](l')$ is weak.

So by a second use of induction, there exists $\lambda_2$ such that

(e) $\lambda_1[l : \tau_1] \subseteq \lambda_2$,

(f) $\mu_2 : \lambda_2$,

(g) $\lambda_2 \vdash v_2 : \tau_2$,

(h) any strong type variable free in $\lambda_2$ is free in $\lambda_1[l : \tau_1]$, and

(i) if a location $l'$ is assigned to in $v_2$ or in the range of $\mu_2$, then $\lambda_2(l')$ is weak.

At this point, $\lambda_2$ may contain free strong type variables that are not free in $\lambda$, namely
those of $\tau_1$. So we cannot take $\lambda_2$ as our final location typing. Instead, define $\lambda'$ by

\[ \lambda'(l') = \mathit{AppClose}_{\lambda}(\lambda_2(l')), \]

for all $l' \in \mathit{dom}(\lambda_2)$. Now we must establish

(e) $\lambda \subseteq \lambda'$,

(f) $\mu_2 : \lambda'$,

(g) $\lambda' \vdash v_2 : \tau_2$,

(h) any strong type variable free in $\lambda'$ is free in $\lambda$, and

(i) if a location $l'$ is assigned to in $v_2$ or in the range of $\mu_2$, then $\lambda'(l')$ is weak.

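To show (e), note that for any $l' \in \mathit{dom}(\lambda)$, $\lambda'(l') = \lambda_2(l')$, by the definition of $\lambda'$:
since $\lambda \subseteq \lambda_2$, every type variable free in $\lambda_2(l') = \lambda(l')$ is free in $\lambda$ and so is not
generalized. It follows that $\lambda \subseteq \lambda'$.

Next we show (f). Since $\mu_2 : \lambda_2$, we have

\[ \lambda_2 \vdash \mu_2(l') : \lambda_2(l') \]

for any $l' \in \mathit{dom}(\mu_2)$. Since $\mathit{AppClose}_{\lambda}(\sigma) \geq \sigma$ for every $\sigma$, by applying Lemma 3.3
repeatedly we get

\[ \lambda' \vdash \mu_2(l') : \lambda_2(l') \]

Finally, from Lemma 3.4 we get

\[ \lambda' \vdash \mu_2(l') : \mathit{AppClose}_{\lambda}(\lambda_2(l')), \]

since any type variables thereby quantified do not occur free in $\lambda'$. Hence $\mu_2 : \lambda'$.

To get (g), apply Lemma 3.3 to $\lambda_2 \vdash v_2 : \tau_2$. And (h) follows immediately from the
definition of $\lambda'$. Finally, for (i) suppose that $l'$ is assigned to in $v_2$ or in the range of
$\mu_2$. By the second use of induction, $\lambda_2(l')$ is weak. Hence $\lambda'(l')$ is weak, since AppClose
quantifies only strong type variables. This completes (bindvar). □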

Corollary  3.7  Restriction  2(i)(b)  of  LCF  ML  [GMW78],  requiring  a  variable  to  have 
a monotype if the variable is assigned to in a $\lambda$-abstraction within its scope, is sound.

Proof.  A  monotype  is  a  type  with  no  type  variables,  so  every  such  type  is  weak.  So  by 
Theorem  3.6,  the  LCF  ML  restriction  is  sound.  □ 